GTM-UVigo System for Multimodal Person Discovery in Broadcast TV Task at MediaEval 2016
Abstract
In this paper, we present the system developed by the GTM-UVigo team for the Multimodal Person Discovery in Broadcast TV task at MediaEval 2016. The proposed approach is a novel strategy for person discovery that, unlike previous work, is not based on speaker and face diarisation. Instead, the task is approached as a person recognition problem: in an enrolment stage, the voice and face of each discovered person are detected; then, for each shot, the most suitable voice and face are assigned using the i-vector paradigm. These two biometric modalities are combined by decision fusion.
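As a rough illustration of the recognition-style pipeline described in the abstract, the sketch below matches a shot's voice and face embeddings against enrolled persons and fuses the two decisions. The abstract does not state the scoring function or the fusion rule, so cosine scoring, the agreement rule, the threshold value, and all names (`cosine`, `best_match`, `fuse`) are assumptions made for illustration; the short vectors stand in for real i-vectors.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def best_match(embedding, enrolled):
    """Return (name, score) of the enrolled person closest to the embedding."""
    scored = ((name, cosine(embedding, ref)) for name, ref in enrolled.items())
    return max(scored, key=lambda pair: pair[1])

def fuse(voice_emb, face_emb, voices, faces, threshold=0.5):
    """Assumed decision-fusion rule: name the shot only if both modalities
    pick the same person and both scores reach the threshold."""
    v_name, v_score = best_match(voice_emb, voices)
    f_name, f_score = best_match(face_emb, faces)
    if v_name == f_name and min(v_score, f_score) >= threshold:
        return v_name
    return None

# Toy enrolment data (hypothetical): one reference vector per person and modality.
voices = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
faces = {"alice": [0.9, 0.1], "bob": [0.2, 0.8]}

result_agree = fuse([0.9, 0.2], [1.0, 0.0], voices, faces)     # both modalities pick "alice"
result_conflict = fuse([0.9, 0.2], [0.0, 1.0], voices, faces)  # modalities disagree
```

Requiring agreement is only one possible decision-level rule; a weighted vote or a fallback to a single modality would fit the same description equally well.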
Related resources
GTM-UVigo Systems for Person Discovery Task at MediaEval 2015
In this paper, we present the systems developed by the GTM-UVigo team for the Multimodal Person Discovery in Broadcast TV task at MediaEval 2015. The systems propose two different strategies for person discovery in audio through speaker diarization (one based on an online clustering strategy with error correction using OCR information and the other based on agglomerative hierarchical clustering) as ...
Tokyo Tech at MediaEval 2016 Multimodal Person Discovery in Broadcast TV task
This paper describes our diarization system for the Multimodal Person Discovery in Broadcast TV task of the MediaEval 2016 Benchmark evaluation campaign [1]. The goal of this task is to name speakers who appear and speak simultaneously in the video, without prior knowledge. Our diarization system relies on a face diarization approach. We extract deep features from a face every 0.5 secon...
Multimodal Person Discovery in Broadcast TV at MediaEval 2016
We describe the “Multimodal Person Discovery in Broadcast TV” task of the MediaEval 2016 benchmarking initiative. Participants are asked to return the names of people who can be both seen and heard in every shot of a collection of videos. The list of people is not known a priori, and their names have to be discovered in an unsupervised way from media content using text overlay or speech transcr...
PERCOLATTE: A Multimodal Person Discovery System in TV Broadcast for the Medieval 2015 Evaluation Campaign
This paper describes the PERCOLATTE participation in the MediaEval 2015 task “Multimodal Person Discovery in Broadcast TV”, which requires developing algorithms for unsupervised talking face identification in broadcast news. The proposed approach relies on two identity propagation strategies, both based on document chaptering and restricted overlaid names propagation rules. The primary submission sh...
Combining Audio Features and Visual I-Vector @ MediaEval 2015 Multimodal Person Discovery in Broadcast TV
This paper describes our diarization system for the Multimodal Person Discovery in Broadcast TV task of the MediaEval 2015 Benchmark evaluation campaign [1]. The goal of this task is to name speakers who appear and speak simultaneously in the video, without prior knowledge. Our diarization system is based on a multimodal approach that combines audio and visual information. We extract feat...